AITopics | value decomposition

Collaborating Authors

value decomposition

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Value Function Decomposition for Iterative Design of Reinforcement Learning Agents

Neural Information Processing SystemsDec-24-2025, 04:36:36 GMT

Designing reinforcement learning (RL) agents is typically a difficult process that requires numerous design iterations. Learning can fail for a multitude of reasons and standard RL methods provide too few tools to provide insight into the exact cause. In this paper, we show how to integrate \textit{value decomposition} into a broad class of actor-critic algorithms and use it to assist in the iterative agent-design process. Value decomposition separates a reward function into distinct components and learns value estimates for each. These value estimates provide insight into an agent's learning and decision-making process and enable new training methods to mitigate common problems.

iterative design, reinforcement learning agent, value function decomposition, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.79)

Add feedback

ff4d5fbbafdf976cfdc032e3bde78de5-Supplemental.pdf

Neural Information Processing SystemsAug-19-2025, 03:07:15 GMT

artificial intelligence, machine learning, operator, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Value Function Decomposition for Iterative Design of Reinforcement Learning Agents

Neural Information Processing SystemsAug-14-2025, 18:20:49 GMT

Despite these successes, applying RL techniques to complex control problems remains a daunting undertaking, where initial attempts often result in underwhelming performance.

agent, reward component, value decomposition, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.04)
Asia > Japan > Honshū > Chūbu > Toyama Prefecture > Toyama (0.04)

Industry: Leisure & Entertainment (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Concept Learning for Cooperative Multi-Agent Reinforcement Learning

Ge, Zhonghan, Zhu, Yuanyang, Chen, Chunlin

arXiv.org Artificial IntelligenceJul-29-2025

Despite substantial progress in applying neural networks (NN) to multi-agent reinforcement learning (MARL) areas, they still largely suffer from a lack of transparency and interoperability. However, its implicit cooperative mechanism is not yet fully understood due to black-box networks. In this work, we study an interpretable value decomposition framework via concept bottleneck models, which promote trustworthiness by conditioning credit assignment on an intermediate level of human-like cooperation concepts. To address this problem, we propose a novel value-based method, named Concepts learning for Multi-agent Q-learning (CMQ), that goes beyond the current performance-vs-interpretability trade-off by learning interpretable cooperation concepts. CMQ represents each cooperation concept as a supervised vector, as opposed to existing models where the information flowing through their end-to-end mechanism is concept-agnostic. Intuitively, using individual action value conditioning on global state embeddings to represent each concept allows for extra cooperation representation capacity. Empirical evaluations on the StarCraft II micromanagement challenge and level-based foraging (LBF) show that CMQ achieves superior performance compared with the state-of-the-art counterparts. The results also demonstrate that CMQ provides more cooperation concept representation capturing meaningful cooperation modes, and supports test-time concept interventions for detecting potential biases of cooperation mode and identifying spurious artifacts that impact cooperation.

artificial intelligence, machine learning, reinforcement learning, (14 more...)

arXiv.org Artificial Intelligence

2507.20143

Country: Asia > China (0.16)

Genre: Research Report > New Finding (0.46)

Industry: Transportation (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Multi-agent Markov Entanglement

Chen, Shuze, Peng, Tianyi

arXiv.org Machine LearningJun-30-2025

Value decomposition has long been a fundamental technique in multi-agent dynamic programming and reinforcement learning (RL). Specifically, the value function of a global state $(s_1,s_2,\ldots,s_N)$ is often approximated as the sum of local functions: $V(s_1,s_2,\ldots,s_N)\approx\sum_{i=1}^N V_i(s_i)$. This approach traces back to the index policy in restless multi-armed bandit problems and has found various applications in modern RL systems. However, the theoretical justification for why this decomposition works so effectively remains underexplored. In this paper, we uncover the underlying mathematical structure that enables value decomposition. We demonstrate that a multi-agent Markov decision process (MDP) permits value decomposition if and only if its transition matrix is not "entangled" -- a concept analogous to quantum entanglement in quantum physics. Drawing inspiration from how physicists measure quantum entanglement, we introduce how to measure the "Markov entanglement" for multi-agent MDPs and show that this measure can be used to bound the decomposition error in general multi-agent MDPs. Using the concept of Markov entanglement, we proved that a widely-used class of index policies is weakly entangled and enjoys a sublinear $\mathcal O(\sqrt{N})$ scale of decomposition error for $N$-agent systems. Finally, we show how Markov entanglement can be efficiently estimated in practice, providing practitioners with an empirical proxy for the quality of value decomposition.

artificial intelligence, data mining, machine learning, (13 more...)

arXiv.org Machine Learning

2506.02385

Country:

North America > United States > New York > New York County > New York City (0.04)
Oceania > New Zealand (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Africa > Kenya (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Add feedback

Generalized Fisher-Weighted SVD: Scalable Kronecker-Factored Fisher Approximation for Compressing Large Language Models

Chekalina, Viktoriia, Moskovskiy, Daniil, Cherniuk, Daria, Kurkin, Maxim, Kuznetsov, Andrey, Frolov, Evgeny

arXiv.org Artificial IntelligenceMay-26-2025

The Fisher information is a fundamental concept for characterizing the sensitivity of parameters in neural networks. However, leveraging the full observed Fisher information is too expensive for large models, so most methods rely on simple diagonal approximations. While efficient, this approach ignores parameter correlations, often resulting in reduced performance on downstream tasks. In this work, we mitigate these limitations and propose Generalized Fisher-Weighted SVD (GFWSVD), a post-training LLM compression technique that accounts for both diagonal and off-diagonal elements of the Fisher information matrix, providing a more accurate reflection of parameter importance. To make the method tractable, we introduce a scalable adaptation of the Kronecker-factored approximation algorithm for the observed Fisher information. We demonstrate the effectiveness of our method on LLM compression, showing improvements over existing compression baselines. For example, at a 20 compression rate on the MMLU benchmark, our method outperforms FWSVD, which is based on a diagonal approximation of the Fisher information, by 5 percent, SVD-LLM by 3 percent, and ASVD by 6 percent compression rate.

artificial intelligence, large language model, natural language, (15 more...)

arXiv.org Artificial Intelligence

2505.17974

Country: North America > United States (1.00)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Low-Rank Matrix Approximation for Neural Network Compression

Cherukuri, Kalyan, Lala, Aarav

arXiv.org Artificial IntelligenceMay-13-2025

Deep Neural Networks (DNNs) have encountered an emerging deployment challenge due to large and expensive memory and computation requirements. In this paper, we present a new Adaptive-Rank Singular Value Decomposition (ARSVD) method that approximates the optimal rank for compressing weight matrices in neural networks using spectral entropy. Unlike conventional SVD-based methods that apply a fixed-rank truncation across all layers, ARSVD uses an adaptive selection of the rank per layer through the entropy distribution of its singular values. This approach ensures that each layer will retain a certain amount of its informational content, thereby reducing redundancy. Our method enables efficient, layer-wise compression, yielding improved performance with reduced space and time complexity compared to static-rank reduction techniques.

arsvd, artificial intelligence, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2504.20078

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Optimizing Singular Spectrum for Large Language Model Compression

Li, Dengjie, Shen, Tiancheng, Zhou, Yao, Yang, Baisong, Liu, Zhongying, Yang, Masheng, Ghanem, Bernard, Yang, Yibo, Zhong, Yujie, Yang, Ming-Hsuan

arXiv.org Artificial IntelligenceFeb-20-2025

Large language models (LLMs) have demonstrated remarkable capabilities, yet prohibitive parameter complexity often hinders their deployment. Existing singular value decomposition (SVD) based compression methods simply deem singular values as importance scores of decomposed components. However, this importance ordered by singular values does not necessarily correlate with the performance of a downstream task. In this work, we introduce SoCo (Singular spectrum optimization for large language model Compression), a novel compression framework that learns to rescale the decomposed components of SVD in a data-driven manner. Concretely, we employ a learnable diagonal matrix to assign importance scores for singular spectrum and develop a three-stage training process that progressively refines these scores from initial coarse compression to fine-grained sparsification-thereby striking an effective balance between aggressive model compression and performance preservation. Thanks to the learnable singular spectrum, SoCo adaptively prunes components according to the sparsified importance scores, rather than relying on the fixed order of singular values. More importantly, the remaining components with amplified importance scores can compensate for the loss of the pruned ones. Experimental evaluations across multiple LLMs and benchmarks demonstrate that SoCo surpasses the state-of-the-art methods in model compression.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2502.15092

Country: North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

BitStack: Fine-Grained Size Control for Compressed Large Language Models in Variable Memory Environments

Wang, Xinghao, Wang, Pengyu, Wang, Bo, Zhang, Dong, Zhou, Yunhua, Qiu, Xipeng

arXiv.org Artificial IntelligenceOct-31-2024

Large language models (LLMs) have revolutionized numerous applications, yet their deployment remains challenged by memory constraints on local devices. While scaling laws have enhanced LLM capabilities, the primary bottleneck has shifted from capability to availability, emphasizing the need for efficient memory management. Traditional compression methods, such as quantization, often require predefined compression ratios and separate compression processes for each setting, complicating deployment in variable memory environments. In this paper, we introduce BitStack, a novel, training-free weight compression approach that enables megabyte-level trade-offs between memory usage and model performance. By leveraging weight decomposition, BitStack can dynamically adjust the model size with minimal transmission between running memory and storage devices. These blocks are sorted and stacked in storage as basic transmission units, with different quantities loaded based on current memory availability. Extensive experiments across a wide range of tasks demonstrate that, despite offering finegrained size control, BitStack consistently matches or surpasses strong quantization baselines, particularly at extreme compression ratios. To the best of our knowledge, this is the first decomposition-based method that effectively bridges the gap to practical compression techniques like quantization. Figure 1: BitStack enables LLMs to dynamically adjust their size in variable memory environments (a) at a megabyte-level, while still matching or surpassing the performance of practical compression methods such as GPTQ (Frantar et al., 2022) and AWQ (Lin et al., 2024) with the same memory footprint(b). Large language models (LLMs) have demonstrated superior performance on various benchmarks (Achiam et al., 2023; Dubey et al., 2024) and are increasingly serving as practical assistants in people's daily lives, such as general language assistants (OpenAI, 2024; Google, 2024; Anthropic, 2024), search engines (Perplexity.AI, 2024), and code assistants (GitHub, 2024). With the blessing of scaling laws (Kaplan et al., 2020), LLMs are becoming more powerful as their sizes expand, and the main bottleneck for deploying task-capable LLMs has shifted from their capability to their availability.

bitstack, compression ratio, residual block, (13 more...)

arXiv.org Artificial Intelligence

2410.23918

Country: